

AD-A161 341  EFFICIENCY LOSS WITH THE KAPLAN-MEIER ESTIMATOR (U)
FLORIDA STATE UNIV TALLAHASSEE DEPT OF STATISTICS
M HOLLANDER ET AL  AUG 85  FSU-STATISTICS-M707
UNCLASSIFIED  AFOSR-TR-85-0975  F49620-85-C-0007  F/G 12/1  NL


Efficiency  Loss 

with  the  Kaplan-Meier  Estimator 
by 

Myles Hollander, Frank Proschan, and James Sconing


FSU Statistics Report M707

AFOSR  Technical  Report  No.  85-181 


August,  1985 


The  Florida  State  University 
Department  of  Statistics 
Tallahassee,  Florida  32306-3033 


Research  sponsored  by  the  Air  Force  Office  of  Scientific  Research,  AFSC,  USAF, 
under  Grant  AFOSR  85-C-0007.  The  U.S.  Government  is  authorized  to  reproduce 
and  distribute  reprints  for  Governmental  purposes  notwithstanding  any  copyright 
notation  thereon. 

1981  AMS  Subject  Classification:  Primary  62G20;  Secondary  62N05 

Key Words and Phrases: Censored model, Kaplan-Meier estimator, proportional
hazards.






ABSTRACT 


We consider the proportional hazards model where the distribution G of the
censoring random variable is related to the distribution F of the lifetime
random variable via $(1 - G) = (1 - F)^{\beta}$. Nonparametric estimators of F
are developed for the case where $\beta$ is unknown and the case where $\beta$
is known. Of interest in their own right, these estimators also enable us to
study the robustness of the Kaplan-Meier estimator (KME) in a nonparametric
model for which it is not the preferred estimator. Comparisons are based on
asymptotic efficiencies and exact mean square errors. We also compare the KME
to the empirical survival function, thereby providing, in a nonparametric
setting, a measure of the loss in efficiency due to the presence of censoring.




1.  INTRODUCTION 


In the usual censorship model we wish to estimate a life distribution
$F(x) = P(X_1 \le x)$ when lifelengths $X_1, X_2, \ldots, X_n$, independent and
identically distributed (i.i.d.) from F, are under censorship by
$Y_1, \ldots, Y_n$, i.i.d. from censoring distribution G. $X_i$ and $Y_i$ are
mutually independent for $i = 1, \ldots, n$, and F and G are continuous with
densities f and g which are strictly positive on $[0, \infty)$. The actual
observations consist of $(Z_i, \delta_i)$, $i = 1, \ldots, n$, where
$Z_i = \min(X_i, Y_i)$ and $\delta_i = I(X_i \le Y_i)$, where $I(A)$ is the
indicator function of the set A.

As an estimator of the survival function $S(t) = 1 - F(t)$, the Kaplan-Meier
(1958) estimator (KME) has received considerable attention. It is defined as

$$\hat S_K(t) = \prod_{Z_{(i)} \le t} c_{in}^{\delta_{(i)}} \, I(Z_{(n)} > t), \quad t \in (0, \infty), \tag{1.1}$$

where $c_{in} = (n - i)(n - i + 1)^{-1}$, $Z_{(1)} < \ldots < Z_{(n)}$ are the
ordered $Z_i$'s, and $\delta_{(i)}$ is the $\delta$ corresponding to $Z_{(i)}$.
The product over an empty set is defined to be one, so that the estimator
equals 1 before the first observation. Some authors (cf. Wellner 1985) use a
slightly different version of the KME defined by

$$\tilde S_K(t) = \prod_{Z_{(i)} \le t} c_{in}^{\delta_{(i)}}, \quad t \in (0, \infty). \tag{1.2}$$

Equation (1.2) differs from (1.1) on $[Z_{(n)}, \infty)$ if $\delta_{(n)} = 0$.
While (1.1) is always zero on $[Z_{(n)}, \infty)$, (1.2) is strictly positive
there if $\delta_{(n)} = 0$, and thus in some samples $\tilde S_K$ does not
correspond to a true distribution function.
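
The two versions of the KME can be sketched in a few lines of Python. This is a minimal illustration under our reading of (1.1) and (1.2), not code from the report; the function name `kaplan_meier` and the flag `zero_after_last` are hypothetical, and ties among the observed times are assumed absent, as they are almost surely under continuous F and G.

```python
def kaplan_meier(z, delta, t, zero_after_last=True):
    """Evaluate the KME at time t (illustrative sketch, assumes no ties).

    z      : observed times Z_i = min(X_i, Y_i)
    delta  : indicators, 1 if X_i <= Y_i (failure observed), else 0
    zero_after_last=True  follows (1.1): the estimate is 0 on [Z_(n), inf)
    zero_after_last=False follows (1.2): no forced drop to 0
    """
    n = len(z)
    pairs = sorted(zip(z, delta))            # order as Z_(1) < ... < Z_(n)
    if zero_after_last and t >= pairs[-1][0]:
        return 0.0                           # indicator I(Z_(n) > t) is 0
    s = 1.0                                  # no factors yet: estimate is 1
    for i, (zi, di) in enumerate(pairs, start=1):
        if zi > t:
            break
        if di == 1:                          # only failures contribute a factor
            s *= (n - i) / (n - i + 1)       # c_in = (n - i) / (n - i + 1)
    return s
```

With the last observation censored, the two versions agree before $Z_{(n)}$ and differ from $Z_{(n)}$ onward, which is the distinction drawn in the text.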

The KME has been studied in great detail. Weak convergence has been studied
by Efron (1967), Breslow and Crowley (1974), Meier (1975), Gill (1983), and
Wellner (1985). Strong consistency was established by Peterson (1977) and
Langberg, Proschan, and Quinzi (1980). Optimality properties were established
by Wellner (1982). Small-sample properties have been studied by Chen, Hollander,
and Langberg (1982), and Wellner (1985). Most of the properties developed in
these papers require only minimal assumptions (e.g., continuity of F and G).

The KME is also the generalized maximum likelihood estimator. These properties,
along with the ease of computation, ease of interpretation, and easily estimated
asymptotic variance (Greenwood's formula), have made the KME standard for
estimating S(t).

Miller (1983) terms the KME "seductive" in that it is very tempting to use.
He studies the KME's efficiency loss when compared to the maximum likelihood
estimate (MLE) in parametric models. Emoto (1984) compares the KME with
parametric MLE's on the basis of mean square error. She considers both the case
when the parametric model is correctly specified and the case when it is
misspecified. Not surprisingly, the KME performs poorly compared to MLE's in a
fully parametric setting. For example, for F and G exponential, Miller (1983)
shows that the asymptotic efficiency of the KME with respect to the MLE tends to
zero as $t \to 0$ and as $t \to \infty$.

We study the properties of the KME by considering the proportional hazards
model, which lies between the parametric model and the fully nonparametric model.
The proportional hazards model is nonparametric in the sense that F is unknown,
but it possesses more structure than the fully nonparametric model assumed for
the KME. By considering the proportional hazards model we can see how well the
KME performs in a setting for which it is not optimal, thus investigating its
robustness. Furthermore, our efficiency results in conjunction with those of
Miller (1983) and Emoto (1984) allow us to determine the degree to which the
KME efficiency losses are due to (1) full parametrization of the distribution
of X and Y and (2) the presence of additional structure governing X and Y.

The proportional hazards model is:


Definition 1.1. (X, Y) follows a proportional hazards model if for some
$\beta > 0$,

$$1 - G(t) = \{S(t)\}^{\beta}, \quad t \in (0, \infty). \tag{1.3}$$

Expression (1.3) is equivalent to

$$R_G(t) = \beta R_F(t), \quad t \in (0, \infty), \tag{1.4}$$

where $R_F(t) = -\log S(t)$ and $R_G(t) = -\log(1 - G(t))$ are the cumulative
hazard functions of F and G respectively.

Proportional hazards has been used in censored models in the past. Efron
(1967) uses the special case of exponential random variables to compare
efficiencies for various two-sample tests. Koziol and Green (1976) derive a
Cramér-von Mises statistic for testing a goodness-of-fit hypothesis that
$F = F_0$, $F_0$ completely specified. Csörgő and Horváth (1984) improved upon
the Koziol-Green test in that Koziol and Green required that $\beta$ be known
whereas Csörgő and Horváth do not need this assumption. Chen, Hollander, and
Langberg (1982) and Wellner (1985) use proportional hazards to compute moments
of the KME. Chen, Hollander, and Langberg use the form of the KME listed in
(1.1) while Wellner uses (1.2).

In Section 2 we develop an estimator $\hat S_P$ (2.3) for estimating S in the
proportional hazards model when $\beta$ is unknown. We compare $\hat S_P$ with
the KME in terms of asymptotic efficiency, exact bias, and exact mean square
error. In Section 3 we advocate the maximum likelihood estimator $\tilde S_P$
(3.1) of S in the case of proportional hazards when $\beta$ is known. One
efficiency result is that the asymptotic efficiency of the KME with respect to
the MLE is $(\beta + 1)^{-1}$. Since $(\beta + 1)^{-1}$ is equal to $P(X < Y)$,
this is a readily interpretable measure of efficiency. In Section 4 the KME is
compared with the empirical survival function. This comparison provides a
measure of the efficiency loss due to censoring.


2.  PROPORTIONAL  HAZARDS.  PROPORTIONALITY  CONSTANT  UNKNOWN 

Assume that the proportional hazards model is known to hold. Then the KME
is no longer the generalized maximum likelihood estimator. The extra information
that $1 - G(\cdot) = \{S(\cdot)\}^{\beta}$ should be utilized. Let
$T_n = n^{-1} \sum_{i=1}^{n} \delta_i$. Then $T_n$ is asymptotically normal with
mean $(\beta + 1)^{-1} = P(X < Y)$ and asymptotic variance:

$$AV(n^{1/2} T_n) = \beta(\beta + 1)^{-2}. \tag{2.1}$$

Let H(t) be the survival distribution for Z. Then
$H(t) = \{S(t)\}^{\beta + 1}$. Let $H_n(t) = n^{-1} \sum_{i=1}^{n} I(Z_i > t)$,
the usual empirical survival estimator for H(t). Then
$A(t) = n^{1/2}\{H_n(t) - H(t)\}$ converges weakly to a Gaussian process with
mean 0 and covariance structure, for $s \le t$, given by

$$\mathrm{Cov}\{A(s), A(t)\} = \{1 - H(s)\}H(t) \quad \text{for } 0 < s \le t < \infty. \tag{2.2}$$

Now our goal is to estimate $S(t) = \{H(t)\}^{1/(\beta + 1)}$. A natural choice is

$$\hat S_P(t) = \{H_n(t)\}^{T_n} \quad \text{for } t \in (0, \infty). \tag{2.3}$$
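
As an illustration, the estimator in (2.3) needs only the empirical survival function of the $Z_i$'s and the proportion of uncensored observations. The sketch below is ours, not the report's; the name `sp_hat` is hypothetical. It relies on Python evaluating `0.0 ** 0.0` as `1.0`, so a sample with no uncensored observations returns 1 for every t.

```python
def sp_hat(z, delta, t):
    """Estimate S(t) by {H_n(t)}**T_n as in (2.3) (illustrative sketch).

    z     : observed times Z_i = min(X_i, Y_i)
    delta : indicators, 1 if X_i <= Y_i, else 0
    """
    n = len(z)
    t_n = sum(delta) / n                       # T_n: proportion uncensored
    h_n = sum(1 for zi in z if zi > t) / n     # H_n(t): empirical survival of Z
    return h_n ** t_n                          # 0.0 ** 0.0 evaluates to 1.0
```

For example, with four observations, two of them uncensored, and two of the $Z_i$ exceeding t, the estimate is $0.5^{0.5} \approx 0.707$.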

We use the following result of Allen (1963).

Theorem 2.1. The pair (X, Y), $0 < P(X < Y) < 1$, follows the proportional
hazards model if and only if the random variables $Z = \min(X, Y)$ and
$\delta = I(X \le Y)$ are independent.
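
The independence in Theorem 2.1 can be checked informally by simulation. The sketch below assumes the exponential special case (X exponential with rate 1, Y exponential with rate $\beta$), which satisfies $1 - G = S^{\beta}$; the function names are hypothetical and the comparison of group means is only a rough diagnostic, not a test of independence.

```python
import random

def simulate_ph(n, beta, seed=0):
    """Draw (Z_i, delta_i) with X ~ Exp(1) and Y ~ Exp(beta) (assumed setup)."""
    rng = random.Random(seed)
    z, delta = [], []
    for _ in range(n):
        x = rng.expovariate(1.0)       # lifetime
        y = rng.expovariate(beta)      # censoring time, rate beta
        z.append(min(x, y))
        delta.append(1 if x <= y else 0)
    return z, delta

def summarize(z, delta):
    """Return (proportion uncensored, mean Z among failures, mean Z among censored)."""
    n1 = sum(delta)
    n0 = len(delta) - n1
    mean_fail = sum(zi for zi, di in zip(z, delta) if di == 1) / n1
    mean_cens = sum(zi for zi, di in zip(z, delta) if di == 0) / n0
    return n1 / len(delta), mean_fail, mean_cens
```

Under this setup the proportion of uncensored observations should be near $(\beta + 1)^{-1}$, and the mean of Z should be about the same in the censored and uncensored groups, as independence of Z and $\delta$ requires.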

From Theorem 2.1, it follows that the random vectors $(Z_1, \ldots, Z_n)$ and
$(\delta_1, \ldots, \delta_n)$ are independent under the proportional hazards
model. Thus the statistics $T_n$ and $H_n(t)$ are independent. This, together
with (2.1), (2.2), and the fact that $g(x, y) = x^y$ has first partial
derivatives which are continuous at $[H(t), (\beta + 1)^{-1}]$, implies that
$n^{1/2}\{\hat S_P(t) - S(t)\}$ converges weakly to a Gaussian process (cf.
Serfling, p. 124) with mean 0 and asymptotic variance given by, for
$t \in (0, \infty)$,

$$AV\{n^{1/2}\hat S_P(t)\} = (\beta + 1)^{-2}\{H(t)\}^{(1 - \beta)/(\beta + 1)}\{1 - H(t)\} + \beta(\beta + 1)^{-2}\{\log H(t)\}^2\{H(t)\}^{2/(\beta + 1)} \tag{2.4}$$

or, equivalently,

$$AV\{n^{1/2}\hat S_P(t)\} = (\beta + 1)^{-2}\{S(t)\}^{1 - \beta}[1 - \{S(t)\}^{\beta + 1}] + \beta\{\log S(t)\}^2\{S(t)\}^2. \tag{2.5}$$
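
The equivalence of (2.4) and (2.5) follows from substituting $H = S^{\beta + 1}$. A small numerical sketch, with hypothetical function names, makes the substitution concrete and also shows the reduction to $S(1 - S)$ at $\beta = 0$.

```python
import math

def av_sp_from_h(h, beta):
    """Asymptotic variance (2.4), written in terms of H(t)."""
    g = 1.0 / (beta + 1.0)
    first = g ** 2 * h ** ((1.0 - beta) / (beta + 1.0)) * (1.0 - h)
    second = beta * g ** 2 * math.log(h) ** 2 * h ** (2.0 * g)
    return first + second

def av_sp_from_s(s, beta):
    """Asymptotic variance (2.5), written in terms of S(t)."""
    first = (beta + 1.0) ** -2 * s ** (1.0 - beta) * (1.0 - s ** (beta + 1.0))
    second = beta * math.log(s) ** 2 * s ** 2
    return first + second
```

Calling `av_sp_from_h(s ** (beta + 1), beta)` and `av_sp_from_s(s, beta)` with the same `s` and `beta` should agree up to rounding.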

It is interesting to note that the asymptotic variance of $\hat S_P$ may
decrease as $\beta$ increases. This is not true of the asymptotic variance of
the KME. From (2.5), we find

$$\frac{\partial}{\partial \beta}[AV\{n^{1/2}\hat S_P(t)\}] = -2(\beta + 1)^{-3}[\{S(t)\}^{1 - \beta} - \{S(t)\}^2] - (\beta + 1)^{-2}\{S(t)\}^{1 - \beta}\log S(t) + \{S(t)\log S(t)\}^2. \tag{2.6}$$

For $\beta$ in a neighborhood of 1 and t close to 0, the right-hand side of
(2.6) is less than zero. It seems counterintuitive that an estimator should
improve as censoring increases. However, note that when $\beta$ is close to 1,
the distribution of Y is almost the same as that of X. Consequently observing Y
is almost as informative as observing X. Thus this result is not surprising
after all.
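
The sign claim for (2.6) can be checked numerically. The following sketch, with hypothetical function names, evaluates (2.6) directly and compares it with a central finite difference of (2.5); the point $S(t) = e^{-0.05}$, $\beta = 1$ is our own choice of an example with t close to 0.

```python
import math

def av_sp(s, beta):
    """Asymptotic variance (2.5) as a function of S(t) and beta."""
    return ((beta + 1.0) ** -2 * s ** (1.0 - beta) * (1.0 - s ** (beta + 1.0))
            + beta * math.log(s) ** 2 * s ** 2)

def d_av_d_beta(s, beta):
    """Right-hand side of (2.6): derivative of (2.5) with respect to beta."""
    ls = math.log(s)
    return (-2.0 * (beta + 1.0) ** -3 * (s ** (1.0 - beta) - s ** 2)
            - (beta + 1.0) ** -2 * s ** (1.0 - beta) * ls
            + (s * ls) ** 2)

def finite_difference(s, beta, step=1e-5):
    """Central difference approximation to the same derivative."""
    return (av_sp(s, beta + step) - av_sp(s, beta - step)) / (2.0 * step)
```

At `s = math.exp(-0.05)` and `beta = 1.0` the derivative is negative, in line with the remark that the asymptotic variance can decrease as $\beta$ increases.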

Note that the estimator in (2.3) jumps at both the observed X's and the
observed Y's. Ebrahimi (1984) proposed an estimator in the proportional hazards
model which jumps only at the observed X's. Also note that the estimator in
(2.3) drops to zero after $Z_{(n)}$ with one exception. In the case where
$T_n = 0$, $\hat S_P(t) \equiv 1$. In this pathological case our estimate for
$\beta$ is infinite.

$\hat S_P$ is also strongly consistent. Note that $H_n(t) \to H(t)$ a.s. and
$T_n \to (\beta + 1)^{-1}$ a.s. by the strong law of large numbers. Since
$g(x, y) = x^y$ is a continuous function,

$$\hat S_P(t) \to \{H(t)\}^{1/(1 + \beta)} = S(t)[\{1 - G(t)\}^{1/(\beta + 1)}\{S(t)\}^{-\beta/(\beta + 1)}] \quad \text{a.s.}$$

If the proportional hazards model holds, then the term
$\phi(t) = [\{1 - G(t)\}^{1/(\beta + 1)}\{S(t)\}^{-\beta/(\beta + 1)}]$ reduces
to 1. If the proportional hazards model does not hold, then the term $\phi(t)$
is a contaminating factor. The error in the estimator then depends on how far
$\phi(t)$ diverges from 1.

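From (2.4) it is seen that the asymptotic variance can be estimated by

$$\widehat{AV}\{n^{1/2}\hat S_P(t)\} = T_n^2\left(\frac{n - i}{n}\right)^{2T_n - 1}\left(\frac{i}{n}\right) + T_n(1 - T_n)\left\{\log\left(\frac{n - i}{n}\right)\right\}^2\left(\frac{n - i}{n}\right)^{2T_n},$$

for $Z_{(i)} \le t < Z_{(i+1)}$. This holds only for $t < Z_{(n)}$. Note also
that if $\beta = 0$, (2.4) reduces to $S(t)\{1 - S(t)\}$, the asymptotic
variance of the usual empirical survival function.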

To compare $\hat S_P$ with the KME, the asymptotic variance for the KME under
the proportional hazards model must be computed. The estimator $\hat S_K(t)$ is
asymptotically normal with asymptotic variance (cf. Miller, 1981):

$$AV\{n^{1/2}\hat S_K(t)\} = \{S(t)\}^2 \int_0^t \frac{dF(u)}{\{S(u)\}^2\{1 - G(u)\}}. \tag{2.7}$$

If $1 - G = S^{\beta}$, then (2.7) reduces to

$$AV\{n^{1/2}\hat S_K(t)\} = (\beta + 1)^{-1}\{S(t)\}^2[\{S(t)\}^{-(\beta + 1)} - 1]. \tag{2.8}$$

The ratio of (2.5) to (2.8) is then

$$\alpha_1(t) \stackrel{\mathrm{def}}{=} e(\hat S_K, \hat S_P) = (\beta + 1)^{-1} + \beta(\beta + 1)\{\log S(t)\}^2[\{S(t)\}^{-(\beta + 1)} - 1]^{-1}. \tag{2.9}$$


Theorem 2.2. The function $\alpha_1(t)$ has the following properties:

i) $\lim_{t \to 0} \alpha_1(t) = (\beta + 1)^{-1}$.

ii) $\lim_{t \to \infty} \alpha_1(t) = (\beta + 1)^{-1}$.

iii) $\alpha_1(t) \ge (\beta + 1)^{-1}$, $0 < t < \infty$.

Proof: (i) Use L'Hospital's rule on the second term in (2.9) to obtain

$$\lim_{t \to 0} \alpha_1(t) = (\beta + 1)^{-1} + \lim_{t \to 0} -2\beta\{\log S(t)\}\{S(t)\}^{\beta + 1} = (\beta + 1)^{-1}.$$

(ii) Use L'Hospital's rule twice on the second term in (2.9) to obtain

$$\lim_{t \to \infty} \alpha_1(t) = (\beta + 1)^{-1} + \lim_{t \to \infty} 2\beta(\beta + 1)^{-1}\{S(t)\}^{\beta + 1} = (\beta + 1)^{-1}.$$

(iii) Note that the second term in (2.9) is always positive. ||
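
For a concrete view of Theorem 2.2, (2.9) can be evaluated for X exponential with parameter 1, so that $S(t) = e^{-t}$ and $\log S(t) = -t$. The sketch below is illustrative only; the function name `alpha1` is hypothetical.

```python
import math

def alpha1(t, beta):
    """Efficiency (2.9) of the KME relative to the estimator (2.3), for S(t) = exp(-t).

    With log S(t) = -t and S(t)**-(beta + 1) - 1 = expm1((beta + 1) * t),
    the second term of (2.9) becomes beta*(beta+1)*t**2 / expm1((beta+1)*t).
    """
    return 1.0 / (beta + 1.0) + beta * (beta + 1.0) * t ** 2 / math.expm1((beta + 1.0) * t)
```

Evaluating `alpha1` at very small and moderately large t returns values close to $(\beta + 1)^{-1}$, and every intermediate value lies above that bound, matching parts (i) to (iii).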

Table 1 gives some values for $\alpha_1(t)$ for X exponential with parameter 1
and Y exponential with parameter $\beta$. Note that the values for
$\alpha_1(t)$ initially increase and then decrease. The value of t for which
this change occurs is given by the solution to the equation
$(\beta + 1)t = 2[1 - \exp\{-t(\beta + 1)\}]$. Table 1 also suggests that
$\alpha_1(t)$ decreases as $\beta$ increases. $\beta$ increasing is equivalent
to censoring increasing stochastically. Thus Table 1 suggests that the
efficiency of the KME with respect to $\hat S_P$ decreases as censoring
increases stochastically. We have been unable to prove this.
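
The turning point has no closed form, but the equation $(\beta + 1)t = 2[1 - \exp\{-t(\beta + 1)\}]$ is easy to solve numerically. The sketch below uses bisection on $u = (\beta + 1)t$; the bracket $[1, 2]$ is our assumption, chosen because the left side minus the right side changes sign on it, and the function name is hypothetical.

```python
import math

def turning_point(beta, tol=1e-12):
    """Positive root t of (beta + 1) t = 2 [1 - exp(-(beta + 1) t)], by bisection."""
    def f(u):                       # u stands for (beta + 1) * t
        return u - 2.0 * (1.0 - math.exp(-u))
    lo, hi = 1.0, 2.0               # assumed bracket: f(lo) < 0 < f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / (beta + 1.0)
```

Because the root in u does not depend on $\beta$, the turning point in t scales as $(\beta + 1)^{-1}$, so heavier censoring moves the peak of the efficiency curve toward smaller t.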

While Table 1 gives values for X and Y exponential, these values hold for
any proportional hazards model. Consider the random variables R(X) and R(Y),
where $R(\cdot)$ is the cumulative hazard function. Then R(X) and R(Y) are
exponential random variables with parameters 1 and $\beta$ respectively. To
find the efficiency of the KME with respect to $\hat S_P(t)$ for this case,
compute R(t) and use Table 1 with R(t) in place of t.

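Finite sample comparisons can also be made using the method of Chen,
Hollander, and Langberg (1982). These authors compute bias and variance for the
KME under the proportional hazards model. Wellner (1985) does the same using
(1.2) rather than (1.1). These methods can also be applied to $\hat S_P$. This
gives

$$E\{\hat S_P(t)\}^a = \sum_{j=0}^{n}\sum_{k=0}^{n}\binom{n}{j}\binom{n}{k}\left(\frac{n - j}{n}\right)^{ak/n}\{S(t)\}^{(n - j)(\beta + 1)}[1 - \{S(t)\}^{\beta + 1}]^j\{(\beta + 1)^{-1}\}^k\{\beta/(\beta + 1)\}^{n - k}, \tag{2.10}$$

where $0^0$ is taken to be 1, in keeping with $\hat S_P(t) \equiv 1$ when
$T_n = 0$.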
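
A direct evaluation of the double sum in (2.10) gives exact bias and mean square error. The sketch below assumes X exponential with parameter 1, so that $S(t) = e^{-t}$; the function names are hypothetical, and it relies on Python evaluating `0.0 ** 0.0` as `1.0` for the case where no observation exceeds t and none is uncensored.

```python
import math

def moment_sp_hat(a, n, t, beta):
    """E{S_P(t)}**a for the beta-unknown estimator, summing over two binomials.

    j counts the Z_i that are <= t; k counts the uncensored observations.
    Assumes S(t) = exp(-t) (X exponential with parameter 1).
    """
    h = math.exp(-t) ** (beta + 1.0)          # H(t) = S(t)**(beta + 1)
    p = 1.0 / (beta + 1.0)                    # P(X < Y)
    total = 0.0
    for j in range(n + 1):
        pj = math.comb(n, j) * h ** (n - j) * (1.0 - h) ** j
        for k in range(n + 1):
            pk = math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
            total += pj * pk * ((n - j) / n) ** (a * k / n)
    return total

def bias_and_mse(n, t, beta):
    """Exact bias and mean square error from the first two moments."""
    s = math.exp(-t)
    m1 = moment_sp_hat(1, n, t, beta)
    m2 = moment_sp_hat(2, n, t, beta)
    return m1 - s, m2 - 2.0 * s * m1 + s * s
```

For n = 1 the sum collapses to $H(t) + \{1 - H(t)\}\beta/(\beta + 1)$, which gives a quick sanity check on the implementation.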

We use (2.10) to calculate bias and mean square error for $\hat S_P(t)$ when X
and Y are exponential with parameters 1 and $\beta$ respectively. Table 2 gives
numerical values for bias and mean square error for $\hat S_K(t)$,
$\tilde S_K(t)$, and $\hat S_P(t)$. The values for $\hat S_K$ and $\tilde S_K$
are obtained from Wellner (1985). From Table 2 we see that $\hat S_P$ is biased
high; in fact, its bias exceeds that of both KME's. This is perhaps due to the
pathological case $T_n = 0$. The mean square error, however, is typically
smaller than that of the KME, particularly when $\beta = t = 2.0$. The cases
for which the mean square error of the KME is smaller seem to correspond to the
cases for which the bias of $\hat S_P$ is large compared to that of the KME.
The mean square error and bias for $\hat S_P$ tend to increase in $\beta$ and
decrease in n. (An exception occurs in the bias values for $\beta = t = 2.0$.)
The values for the general proportional hazards case can be obtained, as
previously seen, by considering exponential variables with R(t), the cumulative
hazard function, taking the place of t.


3.  PROPORTIONAL  HAZARDS.  PROPORTIONALITY  CONSTANT  KNOWN 

Suppose the proportional hazards model is known to hold with $\beta$ known. In
this case an estimator analogous to $\hat S_P$ is:

$$\tilde S_P(t) = \{H_n(t)\}^{\gamma} \quad \text{for } t \in (0, \infty), \tag{3.1}$$

where $\gamma = (\beta + 1)^{-1}$ and $H_n(t)$ is the empirical survival
estimator for the $Z_i$'s.

It follows (Zehna, 1966) that $\tilde S_P(t)$ is the maximum likelihood
estimator for S(t). Further, analogous to Section 2, if the model is correctly
specified, $\tilde S_P(t)$ is strongly consistent:

$$\tilde S_P(t) \to S(t)[\{1 - G(t)\}^{1/(\beta + 1)}\{S(t)\}^{-\beta/(\beta + 1)}] = S(t) \quad \text{a.s.}$$

If the proportional hazards model does not hold or if $\beta$ is misspecified,
then $\tilde S_P(t)$ will not converge to S(t) and the error depends on how much
the term $[\{1 - G(t)\}^{1/(\beta + 1)}\{S(t)\}^{-\beta/(\beta + 1)}]$ differs
from 1.

The estimator $\tilde S_P$ converges weakly to a Gaussian process with mean
S(t) and asymptotic variance given by:

$$AV\{n^{1/2}\tilde S_P(t)\} = (\beta + 1)^{-2}\{H(t)\}^{(1 - \beta)/(\beta + 1)}\{1 - H(t)\} \tag{3.2}$$

or, equivalently,

$$AV\{n^{1/2}\tilde S_P(t)\} = (\beta + 1)^{-2}\{S(t)\}^{1 - \beta}[1 - \{S(t)\}^{\beta + 1}]. \tag{3.3}$$

From (3.2) the asymptotic variance can be estimated by

$$\widehat{AV}\{n^{1/2}\tilde S_P(t)\} = (\beta + 1)^{-2}\left(\frac{n - i}{n}\right)^{(1 - \beta)/(\beta + 1)}\left(\frac{i}{n}\right),$$

for $Z_{(i)} \le t < Z_{(i+1)}$, $i = 1, \ldots, n - 1$. Again the estimator
jumps at both failure times and censoring times. To compare $\tilde S_P$ with
the KME, compute the ratio of (3.3) to (2.8). This yields:

$$\alpha_2(t) \stackrel{\mathrm{def}}{=} e(\hat S_K, \tilde S_P) = (\beta + 1)^{-1}, \quad \text{independent of } t.$$

A  A 

Note that $e(\hat S_K, \tilde S_P)$ decreases as $\beta$ increases. Recall from
Theorem 2.2 that $(\beta + 1)^{-1}$ is also the value of $\alpha_1(t)$ at both
extremes of t. Note that $(\beta + 1)^{-1} = P(X < Y)$ and this represents the
proportion of values for which a failure occurs. Recall that $\tilde S_P$ jumps
at both the observed failure times and the observed censoring times while
$\hat S_K$ jumps only at the observed failure times, $n \cdot P(X < Y)$ in
expectation.

As in Section 2, exact finite sample results can be obtained. Analogous
calculations yield

$$E\{\tilde S_P(t)\}^a = \sum_{j=0}^{n}\binom{n}{j}\left(\frac{n - j}{n}\right)^{a/(\beta + 1)}\{S(t)\}^{(n - j)(\beta + 1)}[1 - \{S(t)\}^{\beta + 1}]^j. \tag{3.4}$$
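
Formula (3.4) is a single binomial sum and can be evaluated directly. The sketch below again assumes X exponential with parameter 1, so that $S(t) = e^{-t}$; the function names are hypothetical.

```python
import math

def moment_sp_tilde(a, n, t, beta):
    """E{S_P(t)}**a for the beta-known estimator, as a sum over j = #{Z_i <= t}.

    Assumes S(t) = exp(-t) (X exponential with parameter 1).
    """
    h = math.exp(-t) ** (beta + 1.0)          # H(t) = S(t)**(beta + 1)
    g = 1.0 / (beta + 1.0)                    # known exponent gamma
    return sum(
        math.comb(n, j) * ((n - j) / n) ** (a * g) * h ** (n - j) * (1.0 - h) ** j
        for j in range(n + 1)
    )

def bias_and_mse_known(n, t, beta):
    """Exact bias and mean square error from the first two moments."""
    s = math.exp(-t)
    m1 = moment_sp_tilde(1, n, t, beta)
    m2 = moment_sp_tilde(2, n, t, beta)
    return m1 - s, m2 - 2.0 * s * m1 + s * s
```

For n = 1 the estimator equals 1 when $Z_1 > t$ and 0 otherwise, so the first moment reduces to H(t); for large n the first moment approaches S(t), in line with the consistency result.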

Bias and mean square error are calculated from (3.4). Table 3 gives the
values for the case X and Y exponential with parameters 1 and $\beta$
respectively. The biases for the $\beta$ known case are higher than for the
$\beta$ unknown case. The mean square errors are everywhere smaller, sometimes
half as small as those for which $\beta$ is unknown. Note that $\hat S_K(t)$
has the smallest mean square error when t = 1.0 and $\beta$ = 2.0. However,
when t = 2.0 and $\beta$ = 2.0, $\hat S_K(t)$ does substantially worse than
each of the other competitors, with mean square error six times as great as
that of $\tilde S_P$. The mean square error and bias of $\tilde S_P$ decrease
with n and increase with $\beta$.


4.  LOSS  IN  EFFICIENCY  DUE  TO  CENSORING

When there is no censoring, the KME reduces to the empirical survival
function, the latter being the estimator of choice in the fully nonparametric
non-censored model. Thus by comparing the KME to the empirical survival
function, we obtain, in a nonparametric context, a measure of efficiency loss
due to the presence of censoring.

From (2.7) the asymptotic variance of the empirical survival function
$\hat S_E(t)$ is given by

$$AV\{n^{1/2}\hat S_E(t)\} = S(t)\{1 - S(t)\}. \tag{4.1}$$

The ratio of (2.7) to (4.1) is then

$$\alpha_3(t) \stackrel{\mathrm{def}}{=} e(\hat S_E, \hat S_K) = S(t)\{1 - S(t)\}^{-1}\int_0^t \frac{dF(u)}{\{S(u)\}^2\{1 - G(u)\}}. \tag{4.2}$$

$\alpha_3(t)$ has the following interpretation. Roughly speaking, $\hat S_K$
requires $n \cdot e(\hat S_E, \hat S_K)$ observations in the censored model to
do as well as $\hat S_E$ does with n observations from the non-censored model.
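
To see the size of $\alpha_3(t)$ in a specific case, suppose (as an assumption for illustration, not a requirement of (4.2)) that X is exponential with parameter 1 and $1 - G = S^{\beta}$. Dividing (2.8) by (4.1) then gives $\alpha_3(t) = (\beta + 1)^{-1} S(t)[\{S(t)\}^{-(\beta + 1)} - 1]/\{1 - S(t)\}$. The sketch below evaluates this closed form and, as a cross-check, the integral in (4.2) by a midpoint rule; the function names are hypothetical.

```python
import math

def alpha3_ph(t, beta):
    """alpha_3(t) in closed form, assuming S(t) = exp(-t) and 1 - G = S**beta."""
    s = math.exp(-t)
    return s * math.expm1((beta + 1.0) * t) / ((beta + 1.0) * (1.0 - s))

def alpha3_numeric(t, beta, steps=20000):
    """Midpoint-rule evaluation of (4.2) for the same exponential pair."""
    s = math.exp(-t)
    du = t / steps
    integral = 0.0
    for k in range(steps):
        u = (k + 0.5) * du
        # dF(u) = exp(-u) du, S(u)**2 = exp(-2u), 1 - G(u) = exp(-beta * u)
        integral += math.exp(-u) / (math.exp(-2.0 * u) * math.exp(-beta * u)) * du
    return s / (1.0 - s) * integral
```

With $\beta = 1$ and t = 1 both functions return about 1.86, meaning the KME would need roughly 86 percent more observations than the empirical survival function at that point under these assumptions.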


Theorem 4.1.

(i) $\lim_{t \to 0} \alpha_3(t) = 1$.

(ii) $\lim_{t \to \infty} \alpha_3(t) = \infty$.

(iii) $\alpha_3(t)$ is increasing in t.

(iv) $\alpha_3(t)$ increases as censoring increases stochastically.

Proof: (i) We have

$$\lim_{t \to 0} \alpha_3(t) = \lim_{t \to 0} S(t)\{F(t)\}^{-1}\int_0^t \frac{dF(u)}{\{S(u)\}^2\{1 - G(u)\}} = \lim_{t \to 0} \{F(t)\}^{-1}\int_0^t \frac{dF(u)}{\{S(u)\}^2\{1 - G(u)\}}.$$

Using L'Hospital's rule,

$$\lim_{t \to 0} \alpha_3(t) = \lim_{t \to 0} \frac{f(t)}{\{1 - G(t)\}\{S(t)\}^2 f(t)} = \lim_{t \to 0} [\{1 - G(t)\}\{S(t)\}^2]^{-1} = 1.$$


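(ii) Let $\epsilon > 0$ be given and choose $t_1$ such that
$1 - G(t) < \epsilon/2$ for $t > t_1$. Choose $t_2 > t_1$ such that
$S(t_1) - S(t_2) > (1/2)S(t_1)$. Then, since $\alpha_3$ is increasing by (iii),

$$\lim_{t \to \infty} \alpha_3(t) \ge S(t_2)\{F(t_2)\}^{-1}\int_0^{t_2} \frac{dF(u)}{\{S(u)\}^2\{1 - G(u)\}}$$

$$\ge S(t_2)\{F(t_2)\}^{-1}\int_{t_1}^{t_2} \frac{dF(u)}{\{S(u)\}^2\{1 - G(u)\}}$$

$$> (2/\epsilon)S(t_2)\int_{t_1}^{t_2} \{S(u)\}^{-2}\{-dS(u)\}$$

$$= \{2S(t_2)/\epsilon\}\left[\{S(u)\}^{-1}\Big|_{t_1}^{t_2}\right] = \{2S(t_2)/\epsilon\}[\{S(t_2)\}^{-1} - \{S(t_1)\}^{-1}] \ge \epsilon^{-1}.$$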

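(iii) Differentiating $\alpha_3(t)$ with respect to t gives

$$S(t)f(t)[F(t)\{1 - G(t)\}\{S(t)\}^2]^{-1} - \int_0^t \frac{dF(u)}{\{S(u)\}^2\{1 - G(u)\}}[f(t)\{F(t)\}^{-2}]$$

$$= f(t)\{F(t)\}^{-1}\left[[S(t)\{1 - G(t)\}]^{-1} - \{F(t)\}^{-1}\int_0^t \frac{dF(u)}{\{S(u)\}^2\{1 - G(u)\}}\right],$$

which is positive if

$$\{F(t)\}^{-1}\{1 - G(t)\}S(t)\int_0^t \frac{dF(u)}{\{S(u)\}^2\{1 - G(u)\}} \le 1. \tag{4.3}$$

The left-hand side of (4.3) is less than

$$\{F(t)\}^{-1}S(t)\int_0^t -\{S(u)\}^{-2}\,dS(u) = \{F(t)\}^{-1}S(t)\left[\{S(u)\}^{-1}\Big|_0^t\right] = 1.$$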


(iv) Note that if censoring increases stochastically, $1 - G(t)$ decreases for
every value of t. This implies that $\alpha_3(t)$ is increasing. ||


These results indicate that when t is small, censoring is not very critical,
but as t increases the censoring has more influence. Consequently, for
functionals of S(t) which involve large values of t, the KME must be used with
caution.

Acknowledgement: We gratefully acknowledge Edsel Pena for checking the
efficiency expressions and the bias and mean square error calculations.


Table 1. Asymptotic Efficiency of $\hat S_K$ with Respect
to $\hat S_P$ under Proportional Hazards with $\beta$ Unknown.

                          beta
   t      .1      .2      1/3     .5      1.0     2.0

   .1    .9186   .8522   .7812   .7130   .5903   .5048
   .2    .9270   .8687   .8082   .7524   .6627   .6253
   .3    .9344   .8832   .8313   .7854   .7189   .7033
   .4    .9409   .8957   .8509   .8126   .7611   .7471
   .5    .9466   .9063   .8672   .8345   .7910   .7642
   .6    .9515   .9153   .8806   .8516   .8103   .7611
   .7    .9556   .9227   .8911   .8645   .8208   .7436
   .8    .9590   .9286   .8993   .8736   .8238   .7164
   .9    .9618   .9333   .9052   .8793   .8208   .6835
  1.0    .9640   .9368   .9091   .8821   .8130   .6477
  1.1    .9656   .9392   .9113   .8824   .8016   .6114
  1.2    .9668   .9406   .9119   .8805   .7873   .5760
  1.3    .9676   .9412   .9112   .8769   .7712   .5428
  1.4    .9679   .9411   .9093   .8718   .7538   .5124
  1.5    .9679   .9403   .9065   .8655   .7358   .4850
  1.6    .9676   .9389   .9028   .8582   .7176   .4608
  1.7    .9670   .9370   .8985   .8502   .6996   .4397
  1.8    .9662   .9347   .8937   .8417   .6820   .4215
  1.9    .9651   .9320   .8884   .8329   .6652   .4061
  2.0    .9639   .9291   .8828   .8239   .6492   .3930